Symbolic Representation of Time Series: A Hierarchical Coclustering Formalization

Authors

  • Alexis Bondu
  • Marc Boullé
  • Antoine Cornuéjols
Abstract

The choice of an appropriate representation remains crucial for mining time series, particularly to reach a good trade-off between dimensionality reduction and the stored information. Symbolic representations constitute a simple way of reducing the dimensionality by turning time series into sequences of symbols. SAXO is a data-driven symbolic representation of time series which encodes typical distributions of data points. This approach was first introduced as a heuristic algorithm based on a regularized coclustering approach. The main contribution of this article is to formalize SAXO as a hierarchical coclustering approach. The search for the best symbolic representation given the data is turned into a model selection problem. Comparative experiments demonstrate the benefit of the new formalization, which results in representations that drastically improve the compression of data.
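
To make the idea of "turning time series into sequences of symbols" concrete, here is a minimal generic sketch. It is not SAXO and not the authors' method: it simply averages the series over fixed segments and maps each segment mean to a letter using empirical quantile breakpoints. The function name `symbolize` and its parameters `n_segments` and `n_symbols` are hypothetical, chosen for illustration only.

```python
import numpy as np

def symbolize(series, n_segments, n_symbols):
    """Generic symbolic encoding (illustrative only, NOT SAXO).

    Assumption: segments are nearly equal in length and symbols are
    assigned from quantile breakpoints of the raw values.
    """
    x = np.asarray(series, dtype=float)
    # Dimensionality reduction: one mean value per segment.
    means = np.array([chunk.mean() for chunk in np.array_split(x, n_segments)])
    # Interior quantiles of the raw values serve as symbol breakpoints.
    cuts = np.quantile(x, np.linspace(0, 1, n_symbols + 1)[1:-1])
    # Index of the bin each segment mean falls into (0 .. n_symbols - 1).
    codes = np.searchsorted(cuts, means, side="right")
    return "".join(chr(ord("a") + c) for c in codes)

word = symbolize([0, 1, 2, 3, 10, 11, 12, 13], n_segments=4, n_symbols=2)
print(word)
```

In this sketch the eight data points are reduced to a four-letter word over a two-letter alphabet. A data-driven approach such as the one described in the abstract would instead learn the time intervals and the symbols from the data.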

Similar resources

Bayesian coclustering of Anopheles gene expression time series: study of immune defense response to multiple experimental challenges.

We present a method for Bayesian model-based hierarchical coclustering of gene expression data and use it to study the temporal transcription responses of an Anopheles gambiae cell line upon challenge with multiple microbial elicitors. The method fits statistical regression models to the gene expression time series for each experiment and performs coclustering on the genes by optimizing a joint...

Feature Extraction over Multiple Representations for Time Series Classification

We suggest a simple yet effective and parameter-free feature construction process for time series classification. Our process is decomposed in three steps: (i) we transform original data into several simple representations; (ii) on each representation, we apply a coclustering method; (iii) we use coclustering results to build new features for time series. It results in a new transactional (i.e....

A Symbolic Representation of Time Series Employing Key-Sequences and a Hierarchical Approach

Efficiently and accurately searching for similarities among time series and discovering interesting patterns is an important and non-trivial problem. There is a lot of prior work e.g., F-index introduced by Agrawal et al, STindex proposed by Faloutsos et al, and PAA suggested by Keogh et al. In this paper we suggest a new method: HFVQA (Hierarchical Frequency-based Vector Quantized Approximatio...

Cats & Co: Categorical Time Series Coclustering

We suggest a novel method of clustering and exploratory analysis of temporal event sequences data (also known as categorical time series) based on three-dimensional data grid models. A data set of temporal event sequences can be represented as a data set of three-dimensional points, each point is defined by three variables: a sequence identifier, a time value and an event value. Instantiating d...

Algorithms for Segmenting Time Series

As with most computer science problems, representation of the data is the key to efficient and effective solutions. Piecewise linear representation has been used for the representation of the data. This representation has been used by various researchers to support clustering, classification, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain...

Publication date: 2015